Table of contents

  1. Hospital Pricing

  2. Potential Outcomes Framework

  3. Selection on Observables

  4. Methods

  5. HCRIS Data

  6. Pricing and Pay for Performance

class: inverse, center, middle
name: hospital_pricing

Background on Hospital Pricing


What is a hospital price?

Lots of different payers paying lots of different prices:

- Medicare fee-for-service prices
- Medicaid payments
- Private insurance negotiations (including Medicare Advantage)
- But what about the price to patients?

.center[ Price \(\neq\) charge \(\neq\) cost \(\neq\) patient out-of-pocket spending]

What is a hospital price?

Not clear what exactly is negotiated…

.pull-left[
### Fee-for-service
- price per procedure
- percentage of charges
- markup over Medicare rates
]

.pull-right[
### Capitation
- payment per patient
- pay-for-performance
- shared savings
]

Hospital prices in real life

We’ll get into the real data in a bit, but for now…a few facts:

  1. Hospital services are expensive

  2. Prices vary dramatically across different areas

  3. Lack of competition is a major reason for high prices

class: inverse, center, middle
name: potential_outcomes

Potential Outcomes Framework


What is a “Potential Outcome”?

.pull-left[
(figure)

Y(1) = $75,000
]

.pull-right[
(figure)

Y(0) = ?
]


Treatment effect = Y(1) - Y(0) = ?

Average Treatment Effect

class: inverse, center, middle
name: selection_observables

Selection on Observables


…with Selection

Assumption 1: Selection on Observables

More formally:



- \(E[Y(1)|W,shape]=E[Y(1)|shape]\)
- \(Y(1),Y(0) \perp\!\!\!\perp W | shape\)

In words: there is nothing unobserved that both determines treatment selection and affects your outcome of interest.

Violation of Selection on Observables

Assumption 2: Common Support

Someone of each type must be in both the treated and untreated groups

\[0 < \text{Pr}(W=1|X) <1\]
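One practical way to probe this assumption is to estimate propensity scores and check that they overlap across treated and untreated groups. A minimal sketch with simulated data (all object names here are illustrative, not from the class data):

```r
# Sketch: checking common support with simulated data
set.seed(123)
n <- 1000
x <- rnorm(n)                                     # observed covariate
w <- rbinom(n, 1, plogis(0.5*x))                  # treatment depends on x
ps <- glm(w ~ x, family=binomial)$fitted.values   # estimated Pr(W=1|X)

# Common support holds if propensity scores overlap across groups
range(ps[w==1])
range(ps[w==0])
```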

Violation of Common Support

class: inverse, center, middle
name: methods

Causal Inference with Observational Data


Fundamental Problem of Causal Inference

Causal inference with observational data

With selection on observables and common support:

1. Matching estimators: \[E[Y_{i}(1) - Y_{i}(0)] = E[ E[Y_{i}|W_{i}=1, X_{i}] - E[Y_{i}|W_{i}=0, X_{i}] ]\]
2. Regression estimators: \[E[Y_{i}(1) - Y_{i}(0)] = E[ E[Y_{i}|W_{i}=1, X_{i}] ] - E[ E[Y_{i}|W_{i}=0, X_{i}] ]\]


What’s the difference?
Matching

Regression

Matching: The process

  1. For each observation \(i\), find the \(m\) “nearest” neighbors, \(J_{m}(i)\).
  2. Impute \(\hat{Y}_{i}(0)\) and \(\hat{Y}_{i}(1)\) for each observation:
    \[\hat{Y}_{i}(0) = \begin{cases} Y_{i} & \text{if} & W_{i}=0 \\ \frac{1}{m} \sum_{j \in J_{m}(i)} Y_{j} & \text{if} & W_{i}=1 \end{cases}\] \[\hat{Y}_{i}(1) = \begin{cases} Y_{i} & \text{if} & W_{i}=1 \\ \frac{1}{m} \sum_{j \in J_{m}(i)} Y_{j} & \text{if} & W_{i}=0 \end{cases}\]

  3. Form “matched” ATE:
    \(\hat{\delta}^{\text{match}} = \frac{1}{N} \sum_{i=1}^{N} \left(\hat{Y}_{i}(1) - \hat{Y}_{i}(0) \right)\)
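The steps above can be sketched by hand on simulated data (a sketch only; variable names are illustrative, and the packaged version via `Matching::Match` appears in the simulation below):

```r
# Sketch: one-nearest-neighbor matching by hand on simulated data
set.seed(123)
n <- 500
x <- runif(n)
w <- rbinom(n, 1, plogis(2*(x - 0.5)))
y <- 1 + 4*w + 2*x + rnorm(n)                  # true ATE = 4

# Impute the missing potential outcome with the nearest opposite-group neighbor
y1.hat <- y0.hat <- numeric(n)
for (i in 1:n) {
  j  <- which(w != w[i])                       # candidates in the other group
  nn <- j[which.min(abs(x[j] - x[i]))]         # nearest neighbor on x
  if (w[i]==1) { y1.hat[i] <- y[i]; y0.hat[i] <- y[nn] }
  else         { y0.hat[i] <- y[i]; y1.hat[i] <- y[nn] }
}
mean(y1.hat - y0.hat)                          # "matched" ATE, close to 4
```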

Animation for matching


Regression

  1. Regress \(Y_{i}\) on \(X_{i}\) among \(W_{i}=1\) to form \(\hat{\mu}_{1}(X_{i})\)
  2. Regress \(Y_{i}\) on \(X_{i}\) among \(W_{i}=0\) to form \(\hat{\mu}_{0}(X_{i})\)
  3. Form difference in predictions:
    .center[ \[\hat{\delta}^{reg} = \frac{1}{N} \sum_{i=1}^{N} \left(\hat{\mu}_{1}(X_{i}) - \hat{\mu}_{0}(X_{i})\right)\]]

Animation for regression


Simulation: nearest neighbor matching

nn.est1 <- Matching::Match(Y=select.dat$y,
                            Tr=select.dat$w,
                            X=select.dat$x,
                            M=1,
                            Weight=1,
                            estimand="ATE")
summary(nn.est1)
## 
## Estimate...  4.0175 
## AI SE......  0.52954 
## T-stat.....  7.5869 
## p.val......  3.2863e-14 
## 
## Original number of observations..............  5000 
## Original number of treated obs...............  1732 
## Matched number of observations...............  5000 
## Matched number of observations  (unweighted).  5016

Simulation: regression

reg1.dat <- select.dat %>% filter(w==1)
reg1 <- lm(y ~ x, data=reg1.dat)

reg0.dat <- select.dat %>% filter(w==0)
reg0 <- lm(y ~ x, data=reg0.dat)
pred1 <- predict(reg1, newdata=select.dat)
pred0 <- predict(reg0, newdata=select.dat)
mean(pred1-pred0)
## [1] 4.076999

class: inverse, center, middle
name: hcris

Understanding HCRIS Data


Hospital Cost Reports


The Data

hcris.data %>% 
  ggplot(aes(x=as.factor(year))) + 
  geom_bar() +
  labs(
    x="Year",
    y="Number of Hospitals",
    title="Number of Hospitals per Year"
  ) + theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust=1))

.plot-callout[ ]

Estimating hospital prices

hcris.data <- hcris.data %>%
  mutate( discount_factor = 1-tot_discounts/tot_charges,
          price_num = (ip_charges + icu_charges + ancillary_charges)*discount_factor - tot_mcare_payment,
          price_denom = tot_discharges - mcare_discharges,
          price = price_num/price_denom)
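As a quick sanity check on the formula above, here is the same calculation for one made-up hospital-year (all dollar amounts are invented, not from HCRIS):

```r
# Toy example of the price calculation (all values made up)
tot_charges       <- 2000000
tot_discounts     <- 1200000
ip_charges        <- 800000
icu_charges       <- 150000
ancillary_charges <- 250000
tot_mcare_payment <- 180000
tot_discharges    <- 300
mcare_discharges  <- 120

discount_factor <- 1 - tot_discounts/tot_charges   # 0.4
price_num   <- (ip_charges + icu_charges + ancillary_charges)*discount_factor - tot_mcare_payment
price_denom <- tot_discharges - mcare_discharges
price_num/price_denom   # about $1,667 per non-Medicare discharge
```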

Estimating hospital prices

.left-code[

hcris.data %>% group_by(year) %>% 
  filter(price_denom>100, !is.na(price_denom), 
         price_num>0, !is.na(price_num),
         price<100000) %>%   #<<
  select(price, year) %>% 
  summarize(mean_price=mean(price, na.rm=TRUE)) %>%
  ggplot(aes(x=as.factor(year), y=mean_price)) + 
  geom_line(aes(group=1)) +
  labs(
    x="Year",
    y="Average Hospital Price",
    title="Hospital Prices per Year"
  ) + scale_y_continuous(labels=comma) +
  theme_bw() + theme(axis.text.x = element_text(angle = 90, hjust=1))

]

.right-plot[ ]

class: inverse, center, middle
name: price_pfp

Pricing and Pay for Performance


Summary stats

Always important to look at your data before doing any formal analysis. Ask yourself a few questions:

1. Are the magnitudes reasonable?
2. Are there lots of missing values?
3. Are there clear examples of misreporting?

Dealing with problems

We’ve adopted a very brute force way to deal with outlier prices. Other approaches include:

1. Investigate very closely the hospitals with extreme values
2. Winsorize at certain thresholds (replace extreme values with pre-determined thresholds)
3. Impute prices for extreme hospitals
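A winsorizing sketch, assuming a price vector like `hcris.data$price` (the vector here is simulated with a few planted extreme values):

```r
# Sketch: winsorize prices at the 1st and 99th percentiles
set.seed(123)
price <- c(rlnorm(990, meanlog=9), runif(10, 5e5, 1e6))  # simulated, with outliers
cuts  <- quantile(price, probs=c(0.01, 0.99), na.rm=TRUE)
price.w <- pmin(pmax(price, cuts[1]), cuts[2])           # clamp to the thresholds
c(max(price), max(price.w))                              # extreme values pulled in
```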

Comparison of hospitals

Are penalized hospitals sufficiently similar to non-penalized hospitals?



Let’s look at covariate balance using a love plot, from the cobalt package.

Love plots without adjustment

love.plot(bal.tab(lp.covs, treat=lp.vars$penalty), 
          colors="black", shapes="circle", threshold=0.1) + 
  theme_bw() + theme(legend.position="none")
.plot-callout[ ]

Love plots without adjustment

1. Exact Matching

m.exact <- Matching::Match(Y=lp.vars$price,
                           Tr=lp.vars$penalty,
                           X=lp.covs,
                           M=1,
                           exact=TRUE) #<<
print(m.exact)
## [1] NA
## attr(,"class")
## [1] "Match"

1. Exact Matching (on a subset)

love.plot(bal.tab(m.exact, covs = lp.covs2, treat = lp.vars$penalty),  
          threshold=0.1, 
          grid=FALSE, sample.names=c("Unmatched", "Matched"),
          position="top", shapes=c("circle","triangle"),
          colors=c("black","blue")) + 
  theme_bw()

.plot-callout[ ]

2. Nearest neighbor matching (inverse variance)

m.nn.var <- Matching::Match(Y=lp.vars$price,
                            Tr=lp.vars$penalty,
                            X=lp.covs,
                            M=4,  #<<
                            Weight=1,
                            estimand="ATE")

v.name=data.frame(new=c("Beds","Medicaid Discharges", "Inpatient Charges",
                   "Medicare Discharges", "Medicare Payments"))

2. Nearest neighbor matching (inverse variance)

love.plot(bal.tab(m.nn.var2, covs = lp.covs, treat = lp.vars$penalty), 
          threshold=0.1, 
          var.names=v.name,
          grid=FALSE, sample.names=c("Unmatched", "Matched"),
          position="top", shapes=c("circle","triangle"),
          colors=c("black","blue")) + 
  theme_bw()

.plot-callout[ ]

2. Nearest neighbor matching (Mahalanobis)

m.nn.md <- Matching::Match(Y=lp.vars$price,
                           Tr=lp.vars$penalty,
                           X=lp.covs,
                           M=1,
                           Weight=2,
                           estimand="ATE")                           

2. Nearest neighbor matching (Mahalanobis)

2. Nearest neighbor matching (propensity score)

love.plot(bal.tab(m.nn.ps, covs = lp.covs, treat = lp.vars$penalty), 
          threshold=0.1, 
          var.names=v.name,
          grid=FALSE, sample.names=c("Unmatched", "Matched"),
          position="top", shapes=c("circle","triangle"),
          colors=c("black","blue")) + 
  theme_bw()

.plot-callout[ ]

3. Weighting
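The `ipw` weights used in the regression below can be constructed from estimated propensity scores. A sketch on simulated data (in the class data, `lp.vars$penalty` and the covariates in `lp.covs` would play these roles):

```r
# Sketch: inverse propensity weights for the ATE, on simulated data
set.seed(123)
n <- 2000
x <- rnorm(n)
w <- rbinom(n, 1, plogis(x))
y <- 2 - 1*w + x + rnorm(n)                    # true effect = -1

ps  <- glm(w ~ x, family=binomial)$fitted.values
ipw <- w/ps + (1-w)/(1-ps)                     # 1/ps if treated, 1/(1-ps) if not
coef(lm(y ~ w, weights=ipw))["w"]              # approximately -1
```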

Results: Nearest neighbor (inverse variance)

## 
## Estimate...  -526.95 
## AI SE......  223.06 
## T-stat.....  -2.3623 
## p.val......  0.01816 
## 
## Original number of observations..............  2707 
## Original number of treated obs...............  698 
## Matched number of observations...............  2707 
## Matched number of observations  (unweighted).  2711

Results: Nearest neighbor (propensity score)

## 
## Estimate...  -201.03 
## AI SE......  275.76 
## T-stat.....  -0.72898 
## p.val......  0.46601 
## 
## Original number of observations..............  2707 
## Original number of treated obs...............  698 
## Matched number of observations...............  2707 
## Matched number of observations  (unweighted).  14795

Results: IPW weighting with regression

ipw.reg <- lm(price ~ penalty, data=lp.vars, weights=ipw)
summary(ipw.reg)
## 
## Call:
## lm(formula = price ~ penalty, data = lp.vars, weights = ipw)
## 
## Weighted Residuals:
##    Min     1Q Median     3Q    Max 
## -18691  -4802  -1422   2651  94137 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   9876.4      147.8  66.808   <2e-16 ***
## penaltyTRUE   -196.9      211.2  -0.932    0.351    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7829 on 2705 degrees of freedom
## Multiple R-squared:  0.0003211,  Adjusted R-squared:  -4.85e-05 
## F-statistic: 0.8688 on 1 and 2705 DF,  p-value: 0.3514

Results: Regression in one step

reg.dat <- lp.vars %>% ungroup() %>% filter(complete.cases(.)) %>%
  mutate(beds_diff = penalty*(beds - mean(beds)),
         mcaid_diff = penalty*(mcaid_discharges - mean(mcaid_discharges)),
         ip_diff = penalty*(ip_charges - mean(ip_charges)),
         mcare_diff = penalty*(mcare_discharges - mean(mcare_discharges)),
         mpay_diff = penalty*(tot_mcare_payment - mean(tot_mcare_payment)))
reg <- lm(price ~ penalty + beds + mcaid_discharges + ip_charges + mcare_discharges + tot_mcare_payment + 
            beds_diff + mcaid_diff + ip_diff + mcare_diff + mpay_diff,
          data=reg.dat)

Summary of ATEs

  1. Exact matching: 1777.63
  2. NN matching, inverse variance: -526.95
  3. NN matching, Mahalanobis: -492.82
  4. NN matching, pscore: -201.03
  5. Inverse pscore weighting: -196.89
  6. IPW regression: -196.89
  7. Regression: -5.85
  8. Regression 1-step: -5.85

class: inverse, center, middle
name: summary

So what have we learned?


Causal effect assuming selection on observables

If we assume selection on observables holds, then we only need to condition on the relevant covariates to identify a causal effect. But we still need to ensure common support…


1. Matching
2. Reweighting
3. Regression